Digital Storytelling Book Generator with MIDI-to-Singing

Authors

  • Hung-Che Shen
  • Chung-Nan Lee
Abstract

Digital storytelling books are an important source of knowledge for blind readers, but creating them usually takes considerable time and effort. To read books aloud from electronic content, automatic procedures can be incorporated into a speech synthesis system. In this paper, we give a practical description of a digital storytelling book generator that combines a free text-to-speech (TTS) program with a MIDI-to-Singing toolkit. A certain amount of emotional TTS customization is obtained through time-pitch manipulation of the synthesized acoustic waveform. MIDI-to-Singing voices can be generated automatically, with special emphasis on lyrical or storytelling-style content, where the uninteresting voices synthesized by traditional TTS programs usually discourage listeners. Rule-based approaches rely on rules that describe how the pitch frequency behaves over time in order to generate time-pitch values. Pitch values fluctuate within a certain range depending on the intended emotion. The MIDI-to-Singing voice synthesis maps pitch frequency values onto the 12-semitone melodic scale and extracts semitone intervals for each emotional state. In the current version of the system, a user can style the synthesized voice by selecting a standard male or female voice in combination with one of 12 predefined expressive styles: Neutral, Monotonic, Lowly-pitched, Highly-pitched, Rising-pitched, Falling-pitched, Happy, Sad, Fear, Anger, Randomly-pitched, and Melody-aligning (singing), using a small set of musical notes. A subjective test shows that synthetic conversations based on MIDI-to-Singing with customized styles are preferred over traditional ones and are judged more natural, intelligible, and enjoyable. Finally, the resulting digital talking recordings can be heard on the web site, allowing comparison between human speech and MIDI-to-Singing synthesized speech.
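
The mapping from pitch frequency to the 12-semitone scale mentioned in the abstract can be illustrated with a minimal sketch. It assumes standard equal temperament with A4 = 440 Hz (MIDI note 69); the abstract does not state the tuning reference or the quantization rule the authors use, and the function names `hz_to_midi`, `midi_to_hz`, and `semitone_intervals` are hypothetical, not part of the described toolkit.

```python
import math

A4_HZ = 440.0   # assumed reference tuning (standard concert pitch)
A4_MIDI = 69    # MIDI note number conventionally assigned to A4


def hz_to_midi(freq_hz: float) -> int:
    """Quantize a pitch frequency to the nearest equal-tempered MIDI note."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    # 12 semitones per octave; one octave is a doubling of frequency.
    return round(A4_MIDI + 12 * math.log2(freq_hz / A4_HZ))


def midi_to_hz(note: int) -> float:
    """Return the equal-tempered frequency of a MIDI note number."""
    return A4_HZ * 2 ** ((note - A4_MIDI) / 12)


def semitone_intervals(contour_hz):
    """Return the semitone steps between consecutive points of a pitch contour."""
    notes = [hz_to_midi(f) for f in contour_hz]
    return [b - a for a, b in zip(notes, notes[1:])]


if __name__ == "__main__":
    # Hypothetical pitch contour in Hz: an octave rise, then one semitone down.
    print(semitone_intervals([220.0, 440.0, 415.3]))
```

Under these assumptions, a rising or falling style would show up as a run of positive or negative intervals, while a monotonic style would give intervals of zero.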


Similar resources

Outlines of Burcas - A simple concatenation-based MIDI-to-singing voice synthesis system

The present paper outlines a simple system (yet to be completed) for concatenation-based singing synthesis in Swedish. The system, called Burcas, takes as input a MIDI file (possibly holding multiple parts) for melody and a text file for lyrics, and it produces standard audio files as output. For the digital signal processing, the MBROLA speech generator is employed. Burcas consists of an input...


An On-the-Fly Mandarin Singing Voice Synthesis System

An on-the-fly Mandarin singing voice synthesis system, called SINVOIS (singing voice synthesis), is proposed in this paper. The SINVOIS system can receive the continuous speech of the lyrics of a song, and generate the singing voice immediately based on the music score information (embedded in a MIDI file) of the song. Two sub-systems are designed and embedded into the system. One is the synthe...


Unique technological voice method (The YUBA Method) shows clear improvement in patients with cochlear implants in singing.

It is known that children with cochlear implants tend to sing off-key, monotonously, and flat. There are a few reports that it is possible to improve off-key singing mainly through instruction using the falsetto voice for people with normal hearing. We examined whether their singing skills could be improved through instruction. Eight subjects (five boys and three girls aged 10.4+/-2.4 years) wi...


Distance Metrics and Indexing Strategies for a Digital Library of Popular Music

People identify powerfully with music: someone might say “that’s my song!” but they are unlikely to say “that’s my book!” or “that’s my picture!” A digital library of popular music therefore has the potential to be a compelling application of information retrieval technology. Such a library requires a retrieval method that is appropriate for a nontechnical audience. Experiments on “query by hum...


Phonetic segmentation of singing voice using MIDI and parallel speech

When analyzing singing voice signal, it is required to know the boundaries of each phonetic unit in the singing voice samples. However, due to prolonged vowels in the singing voice, it is not easy to accurately align a singing voice with the phonetic sequence of its lyrics by conventional speech recognition approach. This paper proposes a solution for the phonetic annotation of the singing voic...




Publication date: 2011